Statistical Significance of Tree Similarity Scores

Authors

  • Kiyoko F. Aoki
  • Atsuko Yamaguchi
  • Yasushi Okuno
  • Tatsuya Akutsu
  • Nobuhisa Ueda
  • Minoru Kanehisa
  • Hiroshi Mamitsuka

Abstract

New methodologies for performing pairwise tree-matching on carbohydrate sugar chain data were introduced in [3], in which well-known sequence alignment algorithms [12] were extended and an already known polynomial-time graph algorithm for finding the maximum common subtree (MCST) of two trees [7] was used to implement what is called the KEGG Carbohydrate Matcher, or KCaM. These new methodologies are currently available on the web via KEGG Glycan [9, 15]. We make note of some appealing work related to not only KCaM but biological tree-structure matching in general.

Similar resources

Empirical statistical estimates for sequence similarity searches.

The FASTA package of sequence comparison programs has been modified to provide accurate statistical estimates for local sequence similarity scores with gaps. These estimates are derived using the extreme value distribution from the mean and variance of the local similarity scores of unrelated sequences after the scores have been corrected for the expected effect of library sequence length. This...

On the statistical significance of nucleic acid similarities

When evaluating sequence similarities among nucleic acids by the usual methods, statistical significance is often found when the biological significance of the similarity is dubious. We demonstrate that the known statistical properties of nucleic acid sequences strongly affect the statistical distribution of similarity values when calculated by standard procedures. We propose a series of models...

The statistical distribution of nucleic acid similarities.

All pairs of a large set of known vertebrate DNA sequences were searched by computer for most similar segments. Analysis of this data shows that the computed similarity scores are distributed proportionally to the logarithm of the product of the lengths of the sequences involved. This distribution is closely related to recent results of Erdos and others on the longest run of heads in coin tossi...

When is Chemical Similarity Significant? The Statistical Distribution of Chemical Similarity Scores and Its Extreme Values

As repositories of chemical molecules continue to expand and become more open, it becomes increasingly important to develop tools to search them efficiently and assess the statistical significance of chemical similarity scores. Here, we develop a general framework for understanding, modeling, predicting, and approximating the distribution of chemical similarity scores and its extreme values in ...

8th Annual Institute for Genomics & Bioinformatics (IGB) Biomedical Informatics Training (BIT) Program Symposium

As repositories of chemical molecules continue to expand and become more open, it becomes increasingly important to develop tools to search them efficiently and assess the statistical significance of chemical similarity scores. Here we develop a general framework for understanding, modeling, predicting, and approximating the distribution of chemical similarity scores and its extreme values in l...

Publication date: 2003